Controlling configurable peak performance limits of a processor

ABSTRACT

In one embodiment, the present invention includes a processor having a plurality of cores each to execute instructions, a non-volatile storage to store maximum peak operating frequency values each a function of a given number of active cores, a configuration storage to store frequency limits each corresponding to one of the maximum peak operating frequency values or a configurable clip frequency value less than the maximum peak operating frequency value. In turn, a power controller is configured to limit operating frequency of the cores to a corresponding frequency limit obtained from the configuration storage. Other embodiments are described and claimed.

This application is a continuation of U.S. patent application Ser. No. 13/785,247, filed on Mar. 5, 2013, which is a continuation of U.S. patent application Ser. No. 13/724,732, filed Dec. 21, 2012, the content of which is hereby incorporated by reference.

BACKGROUND

Advances in semiconductor processing and logic design have permitted an increase in the amount of logic that may be present on integrated circuit devices. As a result, computer system configurations have evolved from a single or multiple integrated circuits in a system to multiple hardware threads, multiple cores, multiple devices, and/or complete systems on individual integrated circuits. Additionally, as the density of integrated circuits has grown, the power requirements for computing systems (from embedded systems to servers) have also escalated. Furthermore, software inefficiencies, and their requirements of hardware, have also caused an increase in computing device energy consumption. In fact, some studies indicate that computing devices consume a sizeable percentage of the entire electricity supply for a country, such as the United States of America. As a result, there is a vital need for energy efficiency and conservation associated with integrated circuits. These needs will increase as servers, desktop computers, notebooks, Ultrabooks™, tablets, mobile phones, processors, embedded systems, etc. become even more prevalent (from inclusion in the typical computer, automobiles, and televisions to biotechnology).

In some software applications, individual processor performance variability across nodes of a compute cluster can result in software failures. At the same time, the nature of modern processors is to take advantage of environmental capacity such as power or thermal constraints and increase processor clock frequency until one or more of these limits are reached. With die-to-die silicon variation, processor operation is generally non-deterministic. The solution for many users who seek to normalize performance across nodes is to disable altogether opportunistic turbo mode operation in which clock frequencies of a processor are increased. While this can more readily ensure determinism of operation across the nodes, a significant amount of performance is lost.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a system in accordance with one embodiment of the present invention.

FIG. 2 is a block diagram illustrating a configurable peak performance limit control mechanism in accordance with an embodiment of the present invention.

FIG. 3 is a flow diagram of a method for dynamically limiting processor frequency in accordance with an embodiment of the present invention.

FIG. 4 is a block diagram of a processor in accordance with an embodiment of the present invention.

FIG. 5 is a block diagram of a processor in accordance with an embodiment of the present invention.

FIG. 6 is a block diagram of a system in accordance with an embodiment of the present invention.

DETAILED DESCRIPTION

In various embodiments, peak performance levels of a processor can be controlled in a manner to achieve some turbo mode performance upside without the variability typically associated with it. In general, processor turbo mode operation is implemented with control algorithms that maximize performance below a package level power budget such that when budget is available, one or more domains of a processor can operate at a frequency greater than a guaranteed maximum frequency. Embodiments may be particularly applicable in two scenarios: maximizing processor core frequency when applications execute at generally low power levels; and maximizing processor core frequency when applications execute with low core utilization (e.g., 4 of 8 cores of a multicore processor are active).

In high volume manufacturing, most processors are capable of running at peak frequencies (namely, a maximum peak frequency for the particular silicon-based processor) that can easily exceed platform power delivery constraints when certain applications are running. This naturally creates a non-determinism of software execution time. However, when processor applications are running at lower core utilization, it is theoretically possible to run them at higher clock frequencies and still ensure determinism since the maximum possible power consumption of the processor package is still below voltage regulator, power supply and wall power delivery constraints. Embodiments thus provide techniques to limit opportunistic processor operation to levels below any of these constraints.

Referring now to FIG. 1, shown is a block diagram of a portion of a system in accordance with an embodiment of the present invention. As shown in FIG. 1, system 100 may include various components, including a processor 110 which as shown is a multicore processor. Processor 110 may be coupled to a power supply 150 via an external voltage regulator 160, which may perform a first voltage conversion to provide a primary regulated voltage to processor 110.

As seen, processor 110 may be a single die processor socket including multiple cores 120a-120n. In addition, each core may be associated with an individual voltage regulator 125a-125n to allow for fine-grained control of voltage and thus power and performance of each individual core. As such, each core can operate at an independent voltage and frequency, enabling great flexibility and affording wide opportunities for balancing power consumption with performance.

Still referring to FIG. 1, additional components may be present within the processor including an input/output interface 132, another interface 134, and an integrated memory controller 136. As seen, each of these components may be powered by another integrated voltage regulator 125x. In one embodiment, interface 132 may be in accordance with the Intel® Quick Path Interconnect (QPI) protocol, which provides for point-to-point (PtP) links in a cache coherent protocol that includes multiple layers including a physical layer, a link layer and a protocol layer. In turn, interface 134 may be in accordance with a Peripheral Component Interconnect Express (PCIe™) specification, e.g., the PCI Express™ Specification Base Specification version 2.0 (published Jan. 17, 2007).

Also shown is a power control unit (PCU) 138, which may include hardware, software and/or firmware to perform power management operations with regard to processor 110. In various embodiments, PCU 138 may include logic to limit processor frequency and/or other operating parameter below a supported level to a dynamically configurable limit in accordance with an embodiment of the present invention. Furthermore, PCU 138 may be coupled via a dedicated interface to external voltage regulator 160. In this way, PCU 138 can instruct the voltage regulator to provide a requested regulated voltage to the processor.

While not shown for ease of illustration, understand that additional components may be present within processor 110 such as additional uncore logic and other components such as internal memories, e.g., one or more levels of a cache memory hierarchy and so forth. Furthermore, while shown in the implementation of FIG. 1 with an integrated voltage regulator, embodiments are not so limited.

Although the following embodiments are described with reference to energy conservation and energy efficiency in specific integrated circuits, such as in computing platforms or processors, other embodiments are applicable to other types of integrated circuits and logic devices. Similar techniques and teachings of embodiments described herein may be applied to other types of circuits or semiconductor devices that may also benefit from better energy efficiency and energy conservation. For example, the disclosed embodiments are not limited to any particular type of computer systems, and may be also used in other devices, such as handheld devices, systems on chip (SoCs), and embedded applications. Some examples of handheld devices include cellular phones, Internet protocol devices, digital cameras, personal digital assistants (PDAs), and handheld PCs. Embedded applications typically include a microcontroller, a digital signal processor (DSP), network computers (NetPC), set-top boxes, network hubs, wide area network (WAN) switches, or any other system that can perform the functions and operations taught below. Moreover, the apparatuses, methods, and systems described herein are not limited to physical computing devices, but may also relate to software optimizations for energy conservation and efficiency. As will become readily apparent in the description below, the embodiments of methods, apparatuses, and systems described herein (whether in reference to hardware, firmware, software, or a combination thereof) are vital to a 'green technology' future, such as for power conservation and energy efficiency in products that encompass a large portion of the US economy.

Note that the configurable frequency and/or other operating parameter control described herein may be independent of and complementary to an operating system (OS)-based mechanism, such as the Advanced Configuration and Power Interface (ACPI) standard (e.g., Rev. 3.0b, published Oct. 10, 2006). According to ACPI, a processor can operate at various performance states or levels, namely from P0 to PN. In general, the P1 performance state may correspond to the highest guaranteed performance state that can be requested by an OS. In addition to this P1 state, the OS can further request a higher performance state, namely a P0 state. This P0 state may thus be an opportunistic or turbo mode state in which, when power and/or thermal budget is available, processor hardware can configure the processor or at least portions thereof to operate at a higher than guaranteed frequency. In many implementations a processor can include multiple so-called bin frequencies above a guaranteed maximum frequency, also referred to as a P1 frequency, extending to a maximum peak frequency of the particular processor, as fused or otherwise written into the processor during manufacture. In addition, according to ACPI, a processor can operate at various power states or levels. With regard to power states, ACPI specifies different power consumption states, generally referred to as C-states, C0, C1 to Cn states. When a core is active, it runs at a C0 state, and when the core is idle it may be placed in a core low power state, also called a core non-zero C-state (e.g., C1-C6 states), with each C-state being at a lower power consumption level (such that C6 is a deeper low power state than C1, and so forth).

Embodiments provide an interface for an entity such as a software entity to control processor peak frequency level as a function of the number of active cores. By default, a processor is configured to operate with active cores operating at frequencies up to the maximum frequency capability of the silicon, where this maximum peak frequency is configured into the processor as one or more fused values (such as a given maximum peak frequency for a given number of active cores). Typically, this maximum peak frequency corresponds to that available in a highest turbo mode when a processor is requested to operate at an ACPI P0 state. Note that this maximum frequency is thus higher than a guaranteed maximum frequency (such as an ACPI P1 state). Using an embodiment of the present invention, the maximum peak frequency available to frequency control algorithms may be reduced or clipped by causing lower per-core turbo frequency constraints to be used by these algorithms. In an embodiment, an interface may be provided to enable software such as basic input/output system (BIOS) configuration code to set these clip values, also referred to herein as a clip or constraint frequency. Although the particular example described herein is with regard to core domain control via core frequency control, understand that similar techniques can be used to control other domains of a processor such as a graphics domain, interconnect domain, uncore domain and so forth.

Configurable per-core frequency limits may be provided that can be updated as a function of the type of workload being executed, assuming that a user has a priori knowledge of characteristics of the workload. Thus for some applications, a customer may configure these values based on a priori knowledge of the application. In an embodiment, these maximum turbo frequency constraints (which are a set of constraints of typically lower values than the processor-configured fused values for maximum peak frequency) may be configured dynamically at run-time.

This control interface can be used to cover all workloads generically and it may also be used to automatically calibrate peak performance levels, assuming a user can predict a priori the type of workloads to be run. To execute this calibration, software may run a workload suite and sweep the turbo frequency constraints to determine a failure surface. Users may speed up the search process by employing a feature that signals interrupts upon detection of a power or thermal constraint. In such cases, upon a single excursion above power or thermal limits, a user may dial back the peak per-core frequency constraints until the software executes without an excursion.
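The calibration sweep described above can be sketched as follows. This is only an illustrative sketch in Python: the function name `calibrate_clip`, the callback `causes_excursion`, the step size and the frequency values are all hypothetical and do not come from the embodiments. The callback stands in for running the workload suite at a given constraint and reporting whether a power or thermal excursion was signaled.

```python
from typing import Callable


def calibrate_clip(start_mhz: int,
                   floor_mhz: int,
                   step_mhz: int,
                   causes_excursion: Callable[[int], bool]) -> int:
    """Dial a clip frequency down from start_mhz until the workload
    runs without a power/thermal excursion (hypothetical helper).

    causes_excursion(freq) is assumed to run the workload with the
    per-core constraint set to freq and return True on an excursion.
    The floor (e.g. a guaranteed frequency) is returned untested if
    the sweep reaches it.
    """
    if step_mhz <= 0:
        raise ValueError("step_mhz must be positive")
    freq = start_mhz
    while freq > floor_mhz and causes_excursion(freq):
        freq -= step_mhz  # dial back after a single excursion
    return max(freq, floor_mhz)


# Stand-in for a real workload run: assume excursions above 2750 MHz.
clip = calibrate_clip(3000, 2000, 100, lambda f: f > 2750)
print(clip)  # 2700
```

In practice the same sweep would presumably be repeated for each active-core count, yielding one clip value per count.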

Referring now to FIG. 2, shown is a block diagram illustrating a configurable peak performance limit control mechanism in accordance with an embodiment of the present invention. As shown in FIG. 2, logic 200 may be part of a processor, and more particularly may be present in a logic of a PCU. In general, logic 200 operates to determine a maximum operating frequency at which cores of the processor can operate, and to limit this maximum frequency below manufacture-time fused values for the processor. Thus as seen in FIG. 2, a processor includes a peak frequency capability storage 210. In an embodiment, this storage may store peak capability information corresponding to a maximum operating frequency that is a function of and/or otherwise dependent on a given number of active cores of a multicore processor. In the example shown, for an N-core processor, N values are provided where each value corresponds to a maximum operating frequency possible for the particular silicon-based processor when the given number of cores is active. The capability information stored in storage 210 in an embodiment may be obtained from fuses or other non-volatile storage of the processor as written or fused during manufacture of the semiconductor die.

To effect configurable user-controlled values below these capability or maximum peak frequency values, a set of configurable frequency limit values may be stored in a storage 220. In an embodiment, there may be N configurable values, with each corresponding to a configurable clip or constraint frequency, as a function of the number of active cores. Note that for the sets of frequencies present in both storage 210 and storage 220, typically with a fewer number of active cores a higher operating frequency is possible. Thus when only one core is active, the operating frequency can be higher than when N cores are active. In an embodiment, these constraint frequency limits may be obtained in various manners, including as configuration values written during BIOS initialization, user-controlled values, e.g., based on a priori knowledge of a workload to be executed on the processor, or so forth. In general, these configurable limit values may be set to levels below the fused values.

As seen, logic 200 includes a min operator 230 to perform a min operation between each of these configurable constraint frequency limits and a corresponding peak frequency capability value such that the lesser of each of the two values for the corresponding given number of active cores can be stored in a corresponding field of a configuration storage 240, also referred to herein as a resolved frequency limit storage. As one example, this configuration storage may be a configuration register available to the PCU that stores a turbo ratio limit value, also referred to herein as a resolved frequency limit, for each possible number of active cores.
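The element-wise min operation described above can be illustrated with a minimal Python sketch. The function name `resolve_frequency_limits` and the frequency values in MHz are hypothetical and chosen only for illustration. List index i holds the value for i+1 active cores, standing in for storage 210 (fused peak values) and storage 220 (configurable clip values).

```python
from typing import List


def resolve_frequency_limits(fused_peak_mhz: List[int],
                             clip_mhz: List[int]) -> List[int]:
    """Return, per active-core count, the lesser of the fused peak
    value and the configurable clip value (hypothetical helper)."""
    if len(fused_peak_mhz) != len(clip_mhz):
        raise ValueError("expected one value per active-core count in each set")
    return [min(peak, clip) for peak, clip in zip(fused_peak_mhz, clip_mhz)]


# Illustrative values for a 4-core part (not taken from any real processor).
fused = [3600, 3400, 3200, 3000]   # stands in for capability storage 210
clip = [3300, 3300, 3300, 2500]    # stands in for configurable storage 220
resolved = resolve_frequency_limits(fused, clip)  # contents of storage 240
print(resolved)  # [3300, 3300, 3200, 2500]
```

Note that the 3-core entry keeps the fused value, since a clip value above the fused capability has no effect under a min operation.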

Referring now to Table 1, shown is an example configuration register arrangement to store a set of resolved frequency limits in accordance with an embodiment of the present invention. As seen, each field of this register may store a resolved value of the set of such values.

TABLE 1

MSB    LSB    Field Name                Description
7      0      1-core Frequency Limit    Controls the maximum clock frequency when one core is active
15     8      2-core Frequency Limit    Controls the maximum clock frequency when two cores are active
23     16     3-core Frequency Limit    Controls the maximum clock frequency when three cores are active
. . .  . . .  . . .                     Etc., scaling up to the maximum available core count of this processor
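As an illustration of the field layout in Table 1, the following Python sketch packs one 8-bit value per active-core count into a single integer and reads a field back. The helper names and the example field values are hypothetical. The encoding of each 8-bit field (for example, as a ratio against a reference clock) is an assumption, since the description calls the value a turbo ratio limit without specifying its units.

```python
from typing import List

FIELD_BITS = 8      # each field in Table 1 spans 8 bits (e.g. bits 7:0)
FIELD_MASK = 0xFF


def pack_limits(values: List[int]) -> int:
    """Pack per-core-count limit values into one register image.
    values[0] goes to bits 7:0 (1-core), values[1] to bits 15:8, etc."""
    reg = 0
    for index, value in enumerate(values):
        if not 0 <= value <= FIELD_MASK:
            raise ValueError("each field must fit in 8 bits")
        reg |= value << (index * FIELD_BITS)
    return reg


def limit_for_active_cores(reg: int, active_cores: int) -> int:
    """Extract the field that applies when active_cores cores are active."""
    if active_cores < 1:
        raise ValueError("active_cores must be at least 1")
    return (reg >> ((active_cores - 1) * FIELD_BITS)) & FIELD_MASK


# Hypothetical field values for 1, 2, 3 and 4 active cores.
reg = pack_limits([33, 33, 32, 25])
print(hex(reg))                         # 0x19202121
print(limit_for_active_cores(reg, 3))   # 32
```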

During operation, e.g., of a control loop of the PCU, and based on the current number of active cores, a given one of these values stored in configuration storage 240 may be selected as the resolved frequency limit to be the maximum turbo mode frequency at which active cores can operate. Note that due to the configurable frequency limits, this maximum turbo mode frequency is likely to be lower than a maximum peak frequency according to the information stored in capability storage 210. For example, while for N active cores, storage 210 may store a maximum peak frequency of 3.0 gigahertz (GHz) (as an example), instead configuration storage 240 may store a resolved frequency limit for N active cores of 2.5 GHz or another frequency less than the maximum peak frequency. Of course different frequencies are possible in different implementations.

Assume that three cores of the multicore processor are active. In this case, a min of the value stored in storage 220 corresponding to the 3-active core frequency limit and the programmed frequency limit in capability storage 210 is determined, stored in configuration storage 240, and used in PCU control operations to thus limit or clip operating frequency of these active cores to this min value. Assume next that a core, system software or other entity requests a performance state that is associated with a higher operating frequency (such as a P0 state). In this case, the PCU does not allow this requested frequency and instead limits performance to that possible using the resolved value stored in configuration storage 240 for the active number of cores.

Although shown at this high level in the embodiment of FIG. 2, understand that the scope of the present invention is not limited in this regard. For example, instead of frequency a different configurable parameter of a processor may be controlled such that a minimum of a maximum peak operating parameter value that is a function of an activity level of the processor (or one or more domains) and a configurable clip parameter value can be selected and used to limit an operating parameter of the processor. As examples in addition to frequency, such configurable parameters may include instruction execution rate, retirement rate or other parameter to maximize performance within a configurable value that does not reach a power limit of the processor.

Furthermore, understand that the representation shown in FIG. 2 is a logical view. That is, in some embodiments, rather than providing for these three different storages and a min operator, only a single configuration storage is present and, during BIOS execution, the silicon-configured values may be updated to lower values, namely the BIOS or user set configurable constraint frequency limits that thus overwrite the fused values obtained from a non-volatile storage.

Referring now to FIG. 3, shown is a flow diagram of a method for dynamically limiting processor frequency in accordance with an embodiment of the present invention. As shown in FIG. 3, method 300 may be performed within logic of a PCU, such as frequency limit control logic. However, understand that in other embodiments this logic can be implemented as a standalone logic or as part of another portion of a processor. As seen, method 300 begins by receiving configurable frequency limit values from a software entity (block 310). For example, upon processor initialization the logic can receive these values from BIOS. Or these values can be dynamically received during processor runtime, e.g., prior to execution of a particular application for which a priori knowledge of its workload is available. In an embodiment, a configurable frequency limit value may be provided for each possible combination of active cores. For a processor with N cores, N such values may be provided. Typically many or all of these values may be constraint values such that they are lower than a maximum peak frequency fused into the processor.

Still referring to FIG. 3, control passes to block 320 where a lower of one of these configurable frequency limit values and the corresponding maximum peak frequency value can be stored into each field of a configuration storage. As an example, this configuration storage may initially store the maximum peak frequency values obtained from a non-volatile storage of the processor. Thus this operation at block 320 may act to overwrite these maximum peak frequency values with the configurable frequency limit values. In other implementations a min operation is performed to obtain these resolved values and store them in the configuration storage. Thus at this point the configuration storage is ready to be accessed during normal operation.

Still referring to FIG. 3, the remaining operations relate to a normal processor operation in which an entity such as OS, driver or so forth issues a request for a thread to execute on a given core with a particular performance level that in turn is associated with a given core frequency of operation.

As part of handling that request, the logic can determine a number of active cores in the processor (block 340). Then it can be determined at diamond 350 whether the N-core resolved value is less than the value for the performance request, namely the operating frequency associated with this request. Note that this performance request value may be directly received from the entity, or it can be obtained via access to a lookup table based on the performance request. If the determination of diamond 350 is in the affirmative, the given domain operating frequency (e.g., the particular core or more globally the entire core domain) can be limited to this resolved frequency value (block 360). This limited frequency may still provide for turbo mode operation for the processor at a level such that power and/or thermal constraints are not reached, enabling deterministic operation of an application or other workload. As such, multiple independent systems of a compute cluster can each execute the same application in a deterministic manner.

Still referring to FIG. 3, otherwise if the corresponding resolved value is not less than the performance request, control passes to block 370 where the domain operating frequency can be enabled at the requested performance level. Although shown at this high level in the embodiment of FIG. 3, understand the scope of the present invention is not limited in this regard.
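The request-handling portion of method 300 (block 340 through block 370) can be sketched in Python as follows. The function name `grant_frequency` and the frequency values in MHz are hypothetical, and the resolved limits are modeled as a simple list indexed by active-core count rather than as a hardware register.

```python
from typing import List


def grant_frequency(resolved_limits_mhz: List[int],
                    active_cores: int,
                    requested_mhz: int) -> int:
    """Return the frequency granted for a performance request.

    resolved_limits_mhz[i] is the resolved limit for i+1 active cores
    (the contents of the configuration storage after block 320).
    """
    if not 1 <= active_cores <= len(resolved_limits_mhz):
        raise ValueError("active_cores outside the supported range")
    # Block 340: the active-core count selects the applicable limit.
    limit = resolved_limits_mhz[active_cores - 1]
    # Diamond 350: is the resolved value less than the requested value?
    if limit < requested_mhz:
        return limit          # block 360: clip to the resolved limit
    return requested_mhz      # block 370: grant the request as issued


resolved = [3300, 3300, 3200, 2500]          # illustrative values only
print(grant_frequency(resolved, 4, 3000))    # 2500 (request is clipped)
print(grant_frequency(resolved, 2, 2400))    # 2400 (request is granted)
```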

Embodiments can be implemented in processors for various markets including server processors, desktop processors, mobile processors and so forth. Referring now to FIG. 4, shown is a block diagram of a processor in accordance with an embodiment of the present invention. As shown in FIG. 4, processor 400 may be a multicore processor including a plurality of cores 410a-410n. In one embodiment, each such core may be of an independent power domain and can be configured to enter and exit active states and/or turbo modes based on workload. The various cores may be coupled via an interconnect 415 to a system agent or uncore 420 that includes various components. As seen, the uncore 420 may include a shared cache 430 which may be a last level cache. In addition, the uncore may include an integrated memory controller 440, various interfaces 450 and a power control unit 455.

In various embodiments, power control unit 455 may include a frequency limit control logic 459 in accordance with an embodiment of the present invention. As described above, this logic acts to dynamically limit maximum operating frequencies to resolved values lower than maximum peak frequency values.

With further reference to FIG. 4, processor 400 may communicate with a system memory 460, e.g., via a memory bus. In addition, by interfaces 450, connection can be made to various off-chip components such as peripheral devices, mass storage and so forth. While shown with this particular implementation in the embodiment of FIG. 4, the scope of the present invention is not limited in this regard.

Referring now to FIG. 5, shown is a block diagram of a multi-domain processor in accordance with another embodiment of the present invention. As shown in the embodiment of FIG. 5, processor 500 includes multiple domains. Specifically, a core domain 510 can include a plurality of cores 510₀-510ₙ, a graphics domain 520 can include one or more graphics engines, and a system agent domain 550 may further be present. In some embodiments, system agent domain 550 may execute at a frequency independent of that of the core domain and may remain powered on at all times to handle power control events and power management such that domains 510 and 520 can be controlled to dynamically enter into and exit high power and low power states. Each of domains 510 and 520 may operate at different voltage and/or power. Note that while only shown with three domains, understand the scope of the present invention is not limited in this regard and additional domains can be present in other embodiments. For example, multiple core domains may be present each including at least one core.

In general, each core 510 may further include low level caches in addition to various execution units and additional processing elements. In turn, the various cores may be coupled to each other and to a shared cache memory formed of a plurality of units of a last level cache (LLC) 540₀-540ₙ. In various embodiments, LLC 540 may be shared amongst the cores and the graphics engine, as well as various media processing circuitry. As seen, a ring interconnect 530 thus couples the cores together, and provides interconnection between the cores, graphics domain 520 and system agent circuitry 550. In one embodiment, interconnect 530 can be part of the core domain. However in other embodiments the ring interconnect can be of its own domain.

As further seen, system agent domain 550 may include display controller 552 which may provide control of and an interface to an associated display. As further seen, system agent domain 550 may include a power control unit 555 which can include a frequency limit control logic 559 in accordance with an embodiment of the present invention to enable configurable dynamic limiting of operating frequency as described herein. In various embodiments, this logic may be configured as in FIG. 2 and may execute the algorithm described above in FIG. 3.

As further seen in FIG. 5, processor 500 can further include an integrated memory controller (IMC) 570 that can provide for an interface to a system memory, such as a dynamic random access memory (DRAM). Multiple interfaces 580₀-580ₙ may be present to enable interconnection between the processor and other circuitry. For example, in one embodiment at least one direct media interface (DMI) interface may be provided as well as one or more Peripheral Component Interconnect Express (PCI Express™ (PCIe™)) interfaces. Still further, to provide for communications between other agents such as additional processors or other circuitry, one or more interfaces in accordance with an Intel® Quick Path Interconnect (QPI) protocol may also be provided. Although shown at this high level in the embodiment of FIG. 5, understand the scope of the present invention is not limited in this regard.

Embodiments may be implemented in many different system types. Referring now to FIG. 6, shown is a block diagram of a system in accordance with an embodiment of the present invention. As shown in FIG. 6, multiprocessor system 600 is a point-to-point interconnect system, and includes a first processor 670 and a second processor 680 coupled via a point-to-point interconnect 650. As shown in FIG. 6, each of processors 670 and 680 may be multicore processors, including first and second processor cores (i.e., processor cores 674a and 674b and processor cores 684a and 684b), although potentially many more cores may be present in the processors. Each of the processors can include a PCU or other logic to perform frequency limiting responsive to control of a software or other entity, as described herein.

Still referring to FIG. 6, first processor 670 further includes a memory controller hub (MCH) 672 and point-to-point (P-P) interfaces 676 and 678. Similarly, second processor 680 includes a MCH 682 and P-P interfaces 686 and 688. As shown in FIG. 6, MCH's 672 and 682 couple the processors to respective memories, namely a memory 632 and a memory 634, which may be portions of system memory (e.g., DRAM) locally attached to the respective processors. First processor 670 and second processor 680 may be coupled to a chipset 690 via P-P interconnects 662 and 664, respectively. As shown in FIG. 6, chipset 690 includes P-P interfaces 694 and 698.

Furthermore, chipset 690 includes an interface 692 to couple chipset 690 with a high performance graphics engine 638, by a P-P interconnect 639. In turn, chipset 690 may be coupled to a first bus 616 via an interface 696. As shown in FIG. 6, various input/output (I/O) devices 614 may be coupled to first bus 616, along with a bus bridge 618 which couples first bus 616 to a second bus 620. Various devices may be coupled to second bus 620 including, for example, a keyboard/mouse 622, communication devices 626 and a data storage unit 628 such as a disk drive or other mass storage device which may include code 630, in one embodiment. Further, an audio I/O 624 may be coupled to second bus 620. Embodiments can be incorporated into other types of systems including mobile devices such as a smart cellular telephone, tablet computer, netbook, Ultrabook™, or so forth.

Embodiments may be used in many different types of systems. For example, in one embodiment a communication device can be arranged to perform the various methods and techniques described herein. Of course, the scope of the present invention is not limited to a communication device, and instead other embodiments can be directed to other types of apparatus for processing instructions, or one or more machine readable media including instructions that in response to being executed on a computing device, cause the device to carry out one or more of the methods and techniques described herein.

Embodiments may be implemented in code and may be stored on a non-transitory storage medium having stored thereon instructions which can be used to program a system to perform the instructions. The storage medium may include, but is not limited to, any type of disk including floppy disks, optical disks, solid state drives (SSDs), compact disk read-only memories (CD-ROMs), compact disk rewritables (CD-RWs), and magneto-optical disks, semiconductor devices such as read-only memories (ROMs), random access memories (RAMs) such as dynamic random access memories (DRAMs), static random access memories (SRAMs), erasable programmable read-only memories (EPROMs), flash memories, electrically erasable programmable read-only memories (EEPROMs), magnetic or optical cards, or any other type of media suitable for storing electronic instructions.

While the present invention has been described with respect to a limited number of embodiments, those skilled in the art will appreciate numerous modifications and variations therefrom. It is intended that the appended claims cover all such modifications and variations as fall within the true spirit and scope of this present invention.

What is claimed is:
 1. A processor comprising: a semiconductor die comprising: a first domain including a plurality of cores; a second domain including at least one graphics engine; a non-volatile storage to store a plurality of maximum peak operating frequency values, wherein each of the maximum peak operating frequency values is a function of a number of active cores of the processor; a storage to store a plurality of configurable clip frequency values, based on a priori knowledge of a workload of an application to be executed on the processor provided by a user; a configuration storage to store a plurality of frequency limits, each of the frequency limits corresponding to one of the maximum peak operating frequency values or a configurable clip frequency value less than the maximum peak operating frequency value obtained from the storage during runtime of the application, to place a limit on an operating frequency, each of the non-volatile storage, the storage and the configuration storage comprising a different storage; and a power controller to place the limit on the operating frequency of at least one of the first domain and the second domain to a corresponding frequency limit obtained from the configuration storage, wherein the power controller is to store in the configuration storage a minimum of the corresponding maximum peak operating frequency value and the configurable clip frequency value.
 2. The processor of claim 1, wherein the power controller is to overwrite one of the maximum peak operating frequency values stored in the configuration storage with the configurable clip frequency value.
 3. The processor of claim 1, wherein the power controller is to use the configurable clip frequency value to enable the processor to enter into a higher performance state, but to prevent the processor from reaching a constraint of the processor.
 4. The processor of claim 1, wherein the power controller is to use the configurable clip frequency value to enable a plurality of processors of different systems to each execute the application in a deterministic manner.
 5. The processor of claim 1, wherein the power controller is to select the operating frequency limit based on a number of active cores of the plurality of cores and prevent a first core from execution at a requested operating frequency when the requested operating frequency is greater than the selected operating frequency limit.
 6. The processor of claim 1, wherein the configurable clip frequency value is based on the a priori knowledge according to execution of the workload using the plurality of maximum peak operating frequency values to determine a failure surface.
 7. A non-transitory machine-readable medium having stored thereon instructions, which if performed by a machine cause the machine to perform a method comprising: storing a set of configurable frequency limit values in a first storage of a multi-domain processor, the set of configurable frequency limit values obtained from a user prior to execution of a workload, based on a priori knowledge of the workload; and storing a corresponding minimum one of the set of configurable frequency limit values from the first storage or one of a set of maximum peak frequency values as a resolved value in each field of a configuration storage of the multi-domain processor, wherein the set of maximum peak frequency values are obtained from a non-volatile storage of the multi-domain processor and the set of configurable frequency limit values are stored in the first storage during runtime of the multi-domain processor to prevent the multi-domain processor from reaching a constraint during turbo mode operation, wherein the first storage, the configuration storage and the non-volatile storage comprise different storages.
 8. The non-transitory machine-readable medium of claim 7, wherein the method further comprises: receiving a performance request for a core domain of the multi-domain processor during the runtime and determining a number of active cores of the core domain; and determining whether a field of the configuration storage corresponding to the number of active cores stores a resolved value less than an operating frequency associated with the performance request, and if so limiting the operating frequency for the core domain to the resolved value.
 9. The non-transitory machine-readable medium of claim 8, wherein the method further comprises otherwise enabling the operating frequency to be at the operating frequency associated with the performance request.
 10. The non-transitory machine-readable medium of claim 7, wherein the method further comprises receiving a performance request for a graphics domain of the multi-domain processor during the runtime and determining a number of active graphics processors of the graphics domain, and limiting an operating frequency for the graphics domain based at least in part on the number of active graphics processors.
 11. The non-transitory machine-readable medium of claim 7, wherein the method further comprises overwriting one of the set of maximum peak frequency values stored in a first field of the configuration storage with a corresponding one of the set of configurable frequency limit values.
 12. A system comprising: a multicore processor including a plurality of cores, a non-volatile storage to store a plurality of maximum peak operating frequency values, wherein each of the maximum peak operating frequency values is a function of a given number of active cores, a limit storage to store a plurality of configurable clip frequency values, and a power control unit (PCU) having a control logic to store in each field of a configuration storage a corresponding minimum one of the maximum peak operating frequency values stored in the non-volatile storage or a configurable clip frequency value stored in the limit storage, the configurable clip frequency value based on a priori information regarding a workload to be executed on the multicore processor, provided by a user prior to execution of the workload, wherein the non-volatile storage, the limit storage and the configuration storage comprise different storages of the multicore processor; and a dynamic random access memory (DRAM) coupled to the multicore processor.
 13. The system of claim 12, wherein the control logic is to perform the storage responsive to execution of a first application including the workload.
 14. The system of claim 12, wherein a second system includes a second multicore processor, the second system and the system to execute the first application in a deterministic manner without reaching a constraint of the multicore processor and the second multicore processor.
 15. The system of claim 12, wherein the control logic is to overwrite one of the maximum peak operating frequency values stored in the configuration storage with the configurable clip frequency value.
 16. The system of claim 12, wherein the PCU is to enable the multicore processor to enter into a higher performance state, and the configurable clip frequency value stored in the configuration storage is to prevent the multicore processor from reaching a constraint during execution of the workload.
 17. The system of claim 12, wherein the a priori information is according to execution of the workload using the plurality of maximum peak operating frequency values to determine a failure surface.
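
By way of non-limiting illustration only, the resolution and limiting behavior recited above (storing in each field of the configuration storage the minimum of the maximum peak operating frequency value and the configurable clip frequency value, then limiting a requested frequency according to the number of active cores) may be sketched in software as follows. This is a minimal sketch under stated assumptions, not the claimed hardware or firmware: the function names `resolve_limits` and `grant_frequency`, the use of Python lists to stand in for the three storages, and all frequency values in MHz are hypothetical and do not appear in the specification.

```python
from typing import List


def resolve_limits(max_peak_mhz: List[int], clip_mhz: List[int]) -> List[int]:
    """Build the configuration-storage contents (hypothetical model).

    Index i holds the limit that applies when i + 1 cores are active.
    Each field is the minimum of the fused maximum peak value and the
    user-supplied clip value for that active-core count.
    """
    if len(max_peak_mhz) != len(clip_mhz):
        raise ValueError("one clip value is expected per maximum peak value")
    return [min(peak, clip) for peak, clip in zip(max_peak_mhz, clip_mhz)]


def grant_frequency(resolved_mhz: List[int], active_cores: int,
                    requested_mhz: int) -> int:
    """Limit a performance request to the resolved value for the core count.

    If the resolved value is less than the requested frequency, the request
    is limited to the resolved value; otherwise it is granted as requested.
    """
    if not 1 <= active_cores <= len(resolved_mhz):
        raise ValueError("active core count is outside the configured range")
    limit = resolved_mhz[active_cores - 1]
    return limit if limit < requested_mhz else requested_mhz


# Hypothetical values: a four-core part whose fused peak falls as more
# cores become active, and a user clip of 3400 MHz for every core count.
MAX_PEAK_MHZ = [3600, 3500, 3300, 3200]   # stands in for non-volatile storage
CLIP_MHZ = [3400, 3400, 3400, 3400]       # stands in for the limit storage
RESOLVED_MHZ = resolve_limits(MAX_PEAK_MHZ, CLIP_MHZ)  # configuration storage

if __name__ == "__main__":
    print(RESOLVED_MHZ)
    print(grant_frequency(RESOLVED_MHZ, active_cores=1, requested_mhz=3600))
```

In this sketch the clip value only takes effect for the one-core and two-core fields, because the fused maximum peak values for three and four active cores are already below the clip.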